In a recently proposed communication system, there would be 
tandem connections of 16 kb/s delta modulators and 2.4 kb/s vocod- 
ers. Preliminary work has indicated that such tandem links would be 
of substantially lower quality than either the delta modulator link or 
the vocoder link alone. The present study, which includes an elabo- 
rate subjective speech quality experiment, confirms this preliminary 
conclusion. It also shows that two other differential waveform coders 
are no better than the proposed delta modulator in tandem links. On 
the other hand, a 5-band sub-band coder does offer substantially 
higher quality than the delta modulator. Still, its performance in 
tandem with the vocoder is poorer than that of the vocoder or the 
sub-band coder alone and is probably of only marginal value for 
practical communication. We have obtained several objective mea- 
sures of speech quality which, for the most part, show relatively little 
correlation with subjective quality. The most successful objective 
predictor of subjective ratings is a linear combination of linear 
predictive coding distances. 

I. BACKGROUND AND INTRODUCTION 

1.1 Background 

Recent plans for United States government digital communication 
networks have focused attention on the compatibility of 2.4-kb/s 
(narrowband) vocoder systems and 16-kb/s (wideband) waveform cod- 
ing schemes. An important question arises in the implementation of 
such a system: If both narrowband and wideband systems are designed 
to provide adequate speech communication individually, will a tandem 
connection of them also function adequately? 
A recent study of this question, 1, 2 using signal-to-noise ratio (s/n) 
and a spectral distance measure as criteria of merit, has cast doubt on 
the viability of circuits containing a 2.4-kb/s lpc (linear predictive 
coding) vocoder and a 16-kb/s cvsd (continuously variable slope delta 
modulation) waveform coder. In that study, it appeared that cvsd was 
the weak link in these tandem connections. However, the conclusions 
could only be regarded as tentative because the speech material 
included in the study was very limited and because the relationship of 
the objective performance measures to the quality of communication 
experienced by human users was by no means evident. 

1.2 Aims 

In the work reported here, we extend previous results by: 

(i) Studying three 16-kb/s waveform coders in addition to cvsd. 

(ii) Presenting subjective as well as objective performance mea- 
sures. 

(iii) Greatly enlarging the variety of speech material processed by 
the various communication systems. 

The specific questions addressed in our study are: 

(i) What is the subjective quality of tandem connections of nar- 
rowband and wideband systems, relative to the quality of 
individual systems? 

(ii) Are there alternatives to cvsd that offer higher quality in either 
(or both) individual or tandem performance? 

(iii) What is the relationship of objective measures of system 
performance to subjective assessment of speech quality? 

1.3 An experiment 

To answer these questions we produced, in software on a Data 
General Eclipse computer, a 2.4-kb/s lpc vocoder and four different 
16-kb/s waveform encoders. They are: 

(i) The cvsd studied in Refs. 1 and 2. 

(ii) A double integration version of cvsd, which we call adm (adap- 
tive delta modulator). 

(iii) A two-bit 8 kHz adpcm (adaptive differential pcm). 

(iv) sbc (sub-band coding) with five separate channels spanning 
the 200 to 3200-Hz band of speech energy. 
Relative to cvsd, adm has virtually the same circuit complexity 
(requiring only one additional resistor and one capacitor), adpcm is 
perhaps 2 to 3 times as complex, and sbc is approximately 10 to 20 
times as complicated. 

The five coding schemes (four waveform coders and lpc) were used 
in 13 different communication systems (five coders individually, four 
waveform coders preceding lpc, four waveform coders following lpc). 
These systems processed a total of 148 speech samples from four 
talkers (two male and two female) at three different power levels 
(spanning a 30-dB range). 

Twenty-two subjects rated each of the processed speech samples on 
a 9-point scale. Each sample consisted of one sentence of 2 to 3 seconds 
duration, and no sentence was heard more than once by any individual 
subject. Although the subjects were asked to rate overall speech 
quality, it is felt that intelligibility had a greater influence over their 
ratings than it does in experiments in which a few sentences are 
repeated many times. 

In addition, four different objective measures of system quality were 
calculated. These include the s/n and spectral distance measure used 
in Refs. 1 and 2, and also two segmental signal-to-noise ratios 3 that 
have been shown in other work to be more closely related to subjective 
quality than s/n. 4 

Statistical analysis of the subjective data reveals a complicated 
pattern of interactions among the experimental variables. The relative 
performances of the various coding schemes are dependent in a com- 
plicated way on talker and on input level as well as on (individual or 
tandem) system configuration. In spite of the complicated dependence 
of subjective quality on physical conditions, clear patterns in the data 
emerge to answer our original questions. 

Among the individual circuits, sbc has on the average the highest 
subjective quality, followed by lpc, cvsd, adm, and adpcm, in that 
order. Tandem connections all are substantially degraded relative to 
individual circuits. Among the waveform coders, sbc provides the best 
tandem connections with lpc, but the sbc-lpc tandems are substan- 
tially worse than either individual system. The tandems involving the 
other waveform coders are probably inadequate for effective speech 
communication. 

Among the objective measures, s/n as in other studies 3, 4 was found 
to be very poorly correlated with subjective quality. Moreover, with 
the diversity of circuit conditions and speech material presented here, 
the segmental signal-to-noise ratios were also of little use in predicting 
subjective quality. The spectral distance measure was the only one 
that was somewhat useful: a linear regression model based on distance 
measures of both tandem links and on overall distance accounted for 
60 percent of the variance in the average ratings. 

II. SYSTEM DESCRIPTION 
2.1 Overview 

In the narrowband-to-wideband tandem link shown in Fig. 1, the 
input speech appears as 16-bit pcm with 8-kHz sampling rate. It is first 
bandpass filtered to a bandwidth of 200 to 3200 Hz by means of a 6th 
order elliptic bandpass filter. It is then vocoded by the lpc vocoder. At 
the output of the vocoder, the sampling rate is converted (if necessary) 
by digital techniques 5 to the sampling rate of the waveform coder. This 
conversion has no effect on the tandem connection and is virtually 
"transparent" in terms of quality. The gains G and 1/G before and 
after the coder are used in measuring the dynamic range (i.e., variations 
in performance as a function of signal level) of the waveform coder. 
The output of the coder is lowpass filtered to 3200 Hz, and its sampling 
rate is converted back to 8 kHz and the output signal is processed by 
a 3200-Hz lowpass filter. In Fig. 2, the same signal processing operations 
are shown with the ordering that provides a wideband-to-narrowband 
connection. 

2.2 The narrowband system (LPC) 

The narrowband system consists of a linear predictive coding 
(lpc) system based on an all-pole model of the speech production 
mechanism. The all-pole model implies that, within a frame of speech, 
the output speech sequence is approximated by 



$$ s_n = \sum_{k=1}^{p} a_k s_{n-k} + G' u_n, \tag{1} $$

where $p$ is the number of poles, $u_n$ is the appropriate input, $G'$ is the 
gain, and the $a_k$'s are the lpc coefficients that represent the spectral 
characteristics of the speech frame. For a voiced speech segment, $u_n$ is 
a sequence of pulses separated by the pitch period. If the segment is 
unvoiced, pseudorandom white noise is used as input. 

Fig. 1. Narrowband-to-wideband system. 

Fig. 2. Wideband-to-narrowband system. 
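The all-pole model of eq. (1) can be illustrated with a short synthesis sketch. The function names, the 2-pole coefficient set, the unit-height pulses, and the Gaussian noise source are all hypothetical choices made for illustration; they are not the coefficients or the excitation generator of the vocoder described here.

```python
import random

def lpc_synthesize(a, gain, excitation):
    """All-pole synthesis: s[n] = sum_k a[k] * s[n-k] + gain * u[n]."""
    p = len(a)
    s = []
    for n, u in enumerate(excitation):
        acc = gain * u
        for k in range(1, p + 1):
            if n - k >= 0:              # samples before the frame are taken as zero
                acc += a[k - 1] * s[n - k]
        s.append(acc)
    return s

def voiced_excitation(num_samples, pitch_period):
    """Unit pulses separated by the pitch period (in samples)."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(num_samples)]

def unvoiced_excitation(num_samples, seed=0):
    """Pseudorandom white noise; a Gaussian source is assumed here."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(num_samples)]

# Hypothetical stable 2-pole filter (pole radius about 0.71), not taken from the study.
a = [1.2, -0.5]
# 160 samples is one 20-ms frame at 8 kHz; a 64-sample pitch period is 8 ms.
voiced_frame = lpc_synthesize(a, gain=1.0, excitation=voiced_excitation(160, 64))
unvoiced_frame = lpc_synthesize(a, gain=1.0, excitation=unvoiced_excitation(160))
```

In the study itself the order is p = 12 and the coefficients change every 20 ms; the sketch holds them fixed over one frame only to keep the recursion visible.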

In our study, the lpc coefficients were calculated by the autocorre- 
lation method with p = 12. The analysis was performed every 20 ms 
(50 times/s) with a variable analysis frame size. The frame size was 
proportional to a running average of the pitch period as obtained at 
the pitch detector output. 6 A Hamming window was used prior to the 
lpc analysis. Pitch detection and voiced-unvoiced (V/U) analysis were 
done using the modified autocorrelation method. 

For quantization purposes, the lpc coefficients were converted to 
log area ratio coefficients, which were coded by means of adpcm 
techniques. 8 An overall bit rate of 2.4 kb/s was obtained by allocating 
48 bits to each of the 50 frames per second. Details of the encoding 
scheme are given in the references. 

2.3 Delta modulators, CVSD and ADM 

The experiment includes two delta modulators: cvsd and a double 
integration version of cvsd, which we refer to as adm. Both of them 
can be represented by the block diagram in Fig. 3. Their principal 
difference is in the nature of the signal feedback path which is a single 
integrator in cvsd and a double integrator in adm. 

In both coders, the step size voltage can be generated by an rc 
integrator as described in Ref. 1. The integrator input depends on the 
three most recent output bits. If they are identical, the step size 
increases; otherwise, it decreases. The adaptation equation is 

$$ \Delta_{k+1} = \beta \Delta_k + (1 - \beta)(V_k + \Delta_{\min}), \tag{2} $$

where 

$\Delta_k$ is the step size at the $k$th sampling instant, 

$\beta = 0.99$ is the step size leakage constant corresponding to an rc 
time constant of 6.4 ms, 

$\Delta_{\min}$ is the minimum step size, and 

$V_k = \Delta_{\max}$ when the three most recent outputs are identical and 
$V_k = 0$ otherwise. 

The dynamic range of the coder is determined by $\Delta_{\max}/\Delta_{\min}$, which is 
150 (44 dB) in the cvsd and 256 (48 dB) in the adm. 

Fig. 3. Block diagram of the cvsd and adm coders. 

2.3.1 CVSD 

As in Ref. 1, the signal feedback loop is a single integrator with a 1- 
ms time constant. The difference equation is 

$$ y_{k+1} = \alpha_1 y_k + H(1 - \alpha_1)\Delta_k b_k, \tag{3} $$

where 

$y_k$ is the integrator output at the $k$th sampling instant, 

$\alpha_1 = 0.94$ is the integrator leakage constant, corresponding to an rc 
time constant of 1 ms, 

$H = 3$ is the integrator gain, and 

$b_k = \pm 1$ is the $k$th output bit. 

In the cvsd, $\Delta_{\max} = 2$ dBm, which places the center of the coder 
dynamic range near -21 dBm, the central value of the three signal 
input levels used in the experiment. 
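The interaction of the step-size recursion (2) and the integrator recursion (3) can be sketched as a small encoder loop. This is a minimal sketch under stated assumptions: amplitudes are in arbitrary units rather than dBm, the step size starts at its minimum with the integrator at rest, $V_k$ is taken as zero whenever the last three bits are not identical, and the function and parameter names are illustrative.

```python
import math

def cvsd_encode(samples, delta_min=1.0, range_ratio=150.0,
                beta=0.99, alpha1=0.94, gain_h=3.0):
    """Encode samples to +/-1 bits; also return the step size used for each bit."""
    delta_max = range_ratio * delta_min
    delta = delta_min      # step size, started at its minimum (assumption)
    y = 0.0                # integrator output, started at rest (assumption)
    bits, steps = [], []
    for x in samples:
        b = 1 if x >= y else -1          # one-bit comparison against the estimate
        bits.append(b)
        steps.append(delta)
        # Eq. (3): leaky single integration of the signed step
        y = alpha1 * y + gain_h * (1.0 - alpha1) * delta * b
        # V_k is delta_max after three identical bits, zero otherwise (assumption)
        run_of_three = len(bits) >= 3 and bits[-1] == bits[-2] == bits[-3]
        v = delta_max if run_of_three else 0.0
        # Eq. (2): leaky step-size adaptation
        delta = beta * delta + (1.0 - beta) * (v + delta_min)
    return bits, steps

# Hypothetical test tone: 500 Hz for 10 ms at the 16-kHz coder rate.
tone = [100.0 * math.sin(2.0 * math.pi * 500.0 * n / 16000.0) for n in range(160)]
bits, steps = cvsd_encode(tone)
```

Because the adaptation depends only on the transmitted bits, a decoder can run the same two recursions on the received bit stream to rebuild the integrator output.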

2.3.2 ADM 

The double integration version of cvsd was selected for evaluation 
after a large number of other delta modulators were simulated. The 
other delta modulators differed from cvsd in one or more of the 
following respects: 

(i) An exponential expandor was used in the step-size feedback 
loop to produce the step size 

$$ \Delta_{k+1} = \exp(\tilde{\Delta}_{k+1}), $$

where $\tilde{\Delta}_{k+1}$ is the quantity given by (2). This changes the 
adaptation from essentially additive to essentially multiplicative. 
(ii) The most recent two bits rather than the most recent three bits 
were used to determine whether the step size would increase or 
decrease. 
(iii) The signal feedback loop contained a double integrator rather 
than a single integrator. 
A limited amount of speech material was processed to evaluate these 
modifications. Segmental s/n was measured for each delta modulator 
configuration and, although some modifications resulted in better 
performance than cvsd for certain input levels, none of them produced 
substantially better results either in terms of dynamic range or peak 
segmental s/n. However, to provide one delta modulation alternative 
to cvsd, we chose the double integration adm. Double integration is 
known to enhance performance significantly at higher sampling rates 
and to be essentially ineffective in 8-kHz dpcm (see Section 2.4). Our 
purpose here was to assess the effectiveness of a particular double 
integrator in a coder with 16-kHz sampling. 

The double integrator in this coder is a second-order iir filter that 
conforms to the block diagram in Fig. 4. The difference equations of 
the integrator are 

$$ u_{k+1} = y_k + H(1 - \alpha_1 - \alpha_2) b_k \Delta_k \tag{4} $$

$$ y_{k+1} = \alpha_1 u_{k+1} + \alpha_2 u_k, \tag{5} $$

where 

$u_k$ is the decoder output, 

$y_k$ is the output of the encoder feedback loop, 

$\alpha_1 = 1.38$, $\alpha_2 = -0.43$ are the filter coefficients, and 

$H = 10$ is the gain. 

The z-transform of the integrator is 

$$ \frac{\alpha_1 z^{-1}(1 - c_3 z^{-1})}{(1 - c_1 z^{-1})(1 - c_2 z^{-1})}, \tag{6} $$

where the integrator poles are related to the filter coefficients by 

$$ c_1 + c_2 = \alpha_1, \qquad c_1 c_2 = -\alpha_2 \tag{7} $$

and the zero is 

$$ c_3 = -\alpha_2/\alpha_1. \tag{8} $$

The corresponding real poles and zero of the filter frequency response 
are 
Fig. 4. Double integrator circuit for the adm coder. 

TANDEM CONNECTIONS OF WAVEFORM CODERS 607 



where $T = 1/16000$ s in a 16-kb/s delta modulator. With $\alpha_1, \alpha_2 = 1.38, 
-0.43$, the pole frequencies are 200 and 2000 Hz, and the zero is at 3500 
Hz, so that the integrator frequency response is approximately that 
shown in Fig. 5. In the adm, $\Delta_{\max} = -5$ dBm, which approximately 
centers the coder dynamic range at -21 dBm. 
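Relations (7) and (8) can be checked numerically for the quoted coefficients. The sketch below solves them for the two poles and the zero, then converts each real root to a frequency using an assumed mapping $f = -\ln(c)/(2\pi T)$, the usual correspondence between a real z-plane root and an analog corner frequency. That mapping and the function names are assumptions made here for illustration. Because the coefficients are quoted to only two decimal places, the computed frequencies should be read as approximate.

```python
import math

def double_integrator_roots(alpha1, alpha2):
    """Solve c1 + c2 = alpha1, c1 * c2 = -alpha2 (eq. 7) and c3 = -alpha2/alpha1 (eq. 8)."""
    # c1 and c2 are the roots of c^2 - alpha1*c - alpha2 = 0
    disc = alpha1 * alpha1 + 4.0 * alpha2
    if disc < 0.0:
        raise ValueError("poles are not real for these coefficients")
    root = math.sqrt(disc)
    c1 = (alpha1 + root) / 2.0
    c2 = (alpha1 - root) / 2.0
    c3 = -alpha2 / alpha1
    return c1, c2, c3

def corner_frequency_hz(c, sample_period):
    """Assumed mapping from a real root 0 < c < 1 to a corner frequency in Hz."""
    return -math.log(c) / (2.0 * math.pi * sample_period)

c1, c2, c3 = double_integrator_roots(1.38, -0.43)
T = 1.0 / 16000.0
pole_low_hz = corner_frequency_hz(c1, T)   # root closest to 1 gives the lowest frequency
pole_high_hz = corner_frequency_hz(c2, T)
zero_hz = corner_frequency_hz(c3, T)
```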

2.4 ADPCM 

Figure 6 shows the block diagram of the adpcm system. In an error- 
free environment, the primed quantities at the receiver are equal to 
the corresponding ones at the transmitter. In the encoder, an error 
sample $e(k)$ is generated as the difference between the input speech 
sample $s(k)$ and a predicted sample $\hat{s}(k)$. After quantization with 
2 bits/sample, the prediction error at both receiver and transmitter is 
added to the predicted sample to give the reconstructed sample $r(k)$. 
The predicted sample $\hat{s}(k)$ is derived from the previous reconstructed 
one, $r(k-1)$, by a first-order transversal predictor: 

$$ \hat{s}(k) = 0.78\, r(k - 1). \tag{10} $$

The coefficient 0.78 was computed by the usual mean-square error 
minimization technique 9 under the hypothesis of an overall s/n of 
about 10 dB. 

The 2-bit coding of the prediction error is effected by means of the 
adaptive step size $\Delta(k)$, which is computed as proportional to a short- 
time estimate $\sigma(k)$ of the absolute magnitude of the quantizer input. 
The estimate $\sigma(k)$ is computed recursively from the decoded prediction 
error $d(k)$ so that no side information has to be transmitted. The 
entire adaptation process is summarized by the following two equations: 

$$ \Delta(k) = C\sigma(k) \tag{11} $$

$$ \sigma(k) = \alpha\sigma(k - 1) + (1 - \alpha)\,|d(k - 1)| \tag{12} $$

In eq. (11), the parameter $C$, the load-constant, determines in the 
steady state the magnitude of the average step-size and hence the 
amount of granular noise and overload distortion. In eq. (12), the 
parameter $\alpha$ determines the speed of response of the adaptation 
algorithm to input level changes: a relatively small value of $\alpha$ produces 
fast response but an inaccurate estimate in steady state. 

Fig. 5. Frequency response of the double integrator circuit. 

Fig. 6. Block diagram of the adpcm coder. 

For simulating a practical implementation, the step size $\Delta(k)$ was 
constrained to assume values in the range $(\Delta_{\min}, \Delta_{\max})$ with 

$$ \Delta_{\max}/\Delta_{\min} = 256. \tag{13} $$

Values of $C$, $\alpha$, and $\Delta_{\min}$ were calculated by optimizing a prediction of 
the subjective opinion score, obtained from separate measures of 
granular noise and overload distortion. The integrator coefficient was 
$\alpha = 0.875$, corresponding to a time constant of 1 ms. The minimum 
step size which produced the same degradations at the high and low 
end of the input level range of interest was found to be -63 dBm. The 
maximum step size is -15 dBm, while the rms speech input level 
assumes values in the range -36 dBm to -6 dBm. 
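The predictor of eq. (10) and the adaptation of eqs. (11) to (13) can be combined in a short encoder and decoder sketch. Several details are assumptions made only for illustration: the quantizer is taken as a 4-level mid-rise design with outputs at plus or minus 0.5 and 1.5 step sizes, the load-constant C is set to 1, amplitudes are in arbitrary units, and all function names are hypothetical. The text does not give the quantizer levels or the optimized value of C.

```python
import math

def _adapt(sigma, d_prev, load_c, alpha, delta_min, delta_max):
    """Eqs. (11)-(13): update the magnitude estimate and clamp the step size."""
    sigma = alpha * sigma + (1.0 - alpha) * abs(d_prev)
    delta = min(max(load_c * sigma, delta_min), delta_max)
    return sigma, delta

def adpcm_encode(samples, load_c=1.0, alpha=0.875, pred=0.78,
                 delta_min=1.0, range_ratio=256.0):
    """Return 2-bit codes as (sign, level) pairs plus the local reconstruction."""
    delta_max = range_ratio * delta_min
    sigma, d_prev, r_prev = 0.0, 0.0, 0.0
    codes, recon = [], []
    for s in samples:
        sigma, delta = _adapt(sigma, d_prev, load_c, alpha, delta_min, delta_max)
        s_hat = pred * r_prev                   # Eq. (10): first-order prediction
        e = s - s_hat                           # prediction error
        sign = 1 if e >= 0 else -1
        level = 0 if abs(e) < delta else 1      # assumed 4-level mid-rise quantizer
        d_prev = sign * (level + 0.5) * delta   # decoded prediction error d(k)
        r_prev = s_hat + d_prev                 # reconstructed sample r(k)
        codes.append((sign, level))
        recon.append(r_prev)
    return codes, recon

def adpcm_decode(codes, load_c=1.0, alpha=0.875, pred=0.78,
                 delta_min=1.0, range_ratio=256.0):
    """Rebuild r(k) from the codes alone; no side information is needed."""
    delta_max = range_ratio * delta_min
    sigma, d_prev, r_prev = 0.0, 0.0, 0.0
    out = []
    for sign, level in codes:
        sigma, delta = _adapt(sigma, d_prev, load_c, alpha, delta_min, delta_max)
        d_prev = sign * (level + 0.5) * delta
        r_prev = pred * r_prev + d_prev
        out.append(r_prev)
    return out

# Hypothetical input: a 300-Hz tone sampled at 8 kHz.
tone = [50.0 * math.sin(2.0 * math.pi * 300.0 * n / 8000.0) for n in range(80)]
codes, recon = adpcm_encode(tone)
decoded = adpcm_decode(codes)
```

The decoder repeats the encoder's recursions on the decoded prediction error, which is why the step size never has to be transmitted.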
2.5 The sub-band coder 

Sub-band coding is a waveform coding technique in which the 
speech band is partitioned into typically 4 or 5 sub-bands by bandpass 
filters. Each sub-band is then lowpass-translated to dc, sampled at its 
Nyquist rate, and then digitally encoded using adaptive pcm (apcm) 
encoding. By this process of dividing the speech band into sub-bands, 
each sub-band can be preferentially encoded according to perceptual 
criteria for that band. On reconstruction, sub-band signals are decoded 
and bandpass-translated back to their original bands. They are then 
summed to give a replica of the original speech signal. 

A particularly attractive implementation of the sub-band coder in 
terms of hardware is based on an integer band sampling approach. 10-12 
With this approach, the modulations to lowpass at the transmitter and 
to bandpass at the receiver are inherent in the sampling process. The 
implementation is illustrated in Fig. 7. Bandpass filters $BP_1$ to $BP_N$ in 
the transmitter and receiver serve to partition the input speech into N 
sub-bands. The coders and decoders encode the sub-band signals and 
the multiplexer combines these digital signals and synchronizing bits 
into a single bit stream for transmission over the digital channel. 

Fig. 7. Block diagram of the sub-band coder. 

Table I. 16 kb/s 5-band sub-band coder 

Band    Band Edges    Sampling     Min. Step-size    Bit           Kb/s 
        (Hz)          Freq (Hz)    (dB)              Allocation 
1       178-356       356          (Ref)             4             1.42 
2       296-593       593                            4             2.37 
3       533-1067      1067                           3             3.20 
4       1067-2133     2133         -3                2             4.27 
5       2133-3200     2133         -8                2             4.27 
Sync                                                               0.47 
Total                                                             16.00 

610 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1979 

Table I shows the choice of bands and bit allocations used in the 16 
kb/s coder. The coder is a 5-band design which was proposed in Ref. 
11. Column 2 shows the frequency range covered by each sub-band. 
The bit allocation refers to the number of bits/sample used by the 
coders in each sub-band. As seen from the table, more accuracy is 
allowed for encoding the lower bands for reasons explained in Ref. 11. 
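The Kb/s column of Table I follows from multiplying each band's sampling frequency by its bit allocation. The short check below reproduces that arithmetic, taking the band-3 allocation as 3 bits/sample, the value consistent with the 3.20 kb/s listed for that band. The variable and function names are illustrative.

```python
# (band, sampling frequency in Hz, bits per sample) as listed in Table I
BANDS = [
    (1, 356, 4),
    (2, 593, 4),
    (3, 1067, 3),   # taken as 3 bits/sample, consistent with 3.20 kb/s
    (4, 2133, 2),
    (5, 2133, 2),
]
SYNC_KBPS = 0.47    # synchronization bits, from Table I

def band_rate_kbps(sampling_hz, bits_per_sample):
    """Bit rate of one sub-band in kb/s."""
    return sampling_hz * bits_per_sample / 1000.0

band_rates = {band: band_rate_kbps(fs, bits) for band, fs, bits in BANDS}
total_kbps = sum(band_rates.values()) + SYNC_KBPS
```

The five bands plus the synchronizing bits account for the full 16 kb/s channel to within rounding.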

The frequency range of the coder extends from 200 to 3200 Hz. A 
plot of the frequency response, shown in Fig. 8, reveals small notches 
at 1067 and 2133 Hz. These notches are due to the transition bands of 
the filters in bands 4 and 5. Subjectively, they are not very perceptible. 
Bands 1 to 3 are overlapped to avoid such notches at lower frequencies. 
The filters are sharp cutoff, 200 tap, fir filters. 

Column 4 in Table I contains the minimum step sizes of the apcm 
coders, expressed in decibels, relative to the minimum step size of band 
1. This choice of minimum step sizes is different than that suggested 
in Ref. 11 and was found to give a better matching of the dynamic 
ranges of the sub-bands. 

III. OBJECTIVE MEASUREMENTS 

Four different objective measurements were made on the waveform 
coders. They are conventional signal-to-noise ratio, snr, two types of 
segmental signal-to-noise ratios, seg1 and seg2, and an lpc spectral 
distance measure, D. In addition, D was used to evaluate the perform- 
ance of the lpc vocoder and the tandem connections of the waveform 
coders and the lpc vocoder. In this section we briefly define each of 
these objective measures. 
Fig. 8. Frequency response of the sub-band coder. 



3.1 Conventional SNR 

The most commonly used measure of performance of digital coders 
has been the conventional signal-to-noise ratio evaluated over an 
utterance of speech. The speech power is defined as 

$$ p = \sum_m x^2(m), \tag{14} $$

and the noise power is defined as 

$$ n = \sum_m (x(m) - y(m))^2, \tag{15} $$

where $x(m)$ and $y(m)$ are the input and output signals of the coder, 
respectively, and the summations in (14) and (15) are taken over the 
entire speech utterance. The conventional s/n is then defined as 

$$ \mathrm{snr} = 10 \log_{10}(p/n). \tag{16} $$

In measuring the input and output signals of the coders, it is generally 
desirable to compensate for the effects of lowpass or bandpass filtering. 
This is done by the arrangement shown in Fig. 9. The input speech 
signal $s(m)$ is coded to form the output speech signal $y(m)$. It is also 
filtered with the same filters used in the coder to generate a compensated 
reference signal $x(m)$ which is used as the input signal in (14) 
and (15). snr is, therefore, strictly a measure of coder distortions and 
is not affected by bandlimiting or group delay in the filters. 

3.2 SEG1 

While snr is the most widely used criterion in measuring coder 
distortion, it has also long been known that in many situations it does 
not correlate well with subjective performance. 13 Another definition of 
signal-to-noise ratio, however, recently proposed by Noll, 3 does appear 
to correlate better with subjective performance. This measure is based 
on s/n measurements made over segments of speech which are typically 
about 20 ms in duration. An average over all of the segments in 
the speech utterance is then taken to obtain a composite measure of 
performance for the entire utterance. If $(s/n)_i$ corresponds to the 
signal-to-noise ratio in decibels for a segment $i$ [computed in the same 
manner as in (16)], the segmental s/n (seg1) is then defined as 

$$ \mathrm{seg1} = \frac{1}{N} \sum_{i=1}^{N} (s/n)_i \quad \text{(dB)}, \tag{17} $$

where it is assumed that there are $N$ 20-ms segments in the speech 
utterance. 

Fig. 9. Circuit for measuring signal-to-noise ratios. 

Problems arise in this definition of segmental s/n when intervals of 
silence exist in the speech utterance. In segments where the input 
signal $x(m)$ is essentially zero, any slight noise will give rise to large 
negative $(s/n)_i$, and these segments may unduly dominate the average 
in (17). To prevent this anomaly, we first identify those segments 
which correspond to silence and exclude them from the average in 
(17). This is achieved by means of a simple threshold. Let $p_i$ represent 
the (average) speech energy in a segment $i$, so that 

$$ p_i = \frac{1}{K} \sum_{m=1}^{K} x^2(m), \tag{18} $$

where $K$ corresponds to the number of speech samples in the segment. 
Then the segment will be included in the computation of seg1 in (17) 
if its energy exceeds a threshold $T$, i.e., if $p_i > T$. If it does not exceed 
this threshold, it is not included in the average in (17). Furthermore, 
to prevent any one segment from dominating the average we also limit 
the value of $(s/n)_i$ to a range of -10 to +80 dB. That is, $-10 \le (s/n)_i 
\le 80$ dB. In computer simulations, the 16-bit wordlength admitted 
signal levels in the range ±32767 and we set $T$ to 900, corresponding to 
-55 dBm. 

3.3 SEG2 

The definition of this measure is 

$$ \mathrm{seg2} = \frac{1}{N} \sum_{i=1}^{N} 10 \log_{10}(1 + p_i/n_i) \quad \text{(dB)}, \tag{19} $$

where $p_i$ is the signal power in segment $i$ and $n_i$ is the noise power in 
segment $i$. They are defined (on segments) according to eqs. (14) and 
(15), respectively. 

Unlike seg1, seg2 does not have any thresholds. It is self-limiting to 
a lower value of 0 dB due to the addition of the constant 1 inside the 
logarithm. As in the seg1 measure, seg2 uses 20 ms segments. 
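The three signal-to-noise measures of eqs. (16), (17), and (19) can be compared side by side in a short sketch. The segment length of 160 samples assumes 20-ms segments at an 8-kHz sampling rate, noise-free segments are clipped to the upper limit in seg1 and skipped in seg2 (cases the text does not address), and the function names are illustrative.

```python
import math

def _power(v):
    return sum(a * a for a in v)

def _noise(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def _segments(x, y, seg_len):
    """Yield aligned, non-overlapping segments; a trailing partial segment is dropped."""
    for start in range(0, len(x) - seg_len + 1, seg_len):
        yield x[start:start + seg_len], y[start:start + seg_len]

def snr_db(x, y):
    """Eq. (16): conventional s/n over the whole utterance."""
    return 10.0 * math.log10(_power(x) / _noise(x, y))

def seg1_db(x, y, seg_len=160, threshold=900.0, lo=-10.0, hi=80.0):
    """Eq. (17): mean of per-segment s/n, with silence exclusion and clipping."""
    values = []
    for xs, ys in _segments(x, y, seg_len):
        if _power(xs) / len(xs) <= threshold:   # eq. (18): skip silent segments
            continue
        n = _noise(xs, ys)
        seg_snr = hi if n == 0 else 10.0 * math.log10(_power(xs) / n)
        values.append(min(max(seg_snr, lo), hi))
    if not values:
        raise ValueError("no segment exceeds the silence threshold")
    return sum(values) / len(values)

def seg2_db(x, y, seg_len=160):
    """Eq. (19): mean of 10*log10(1 + p_i/n_i); no threshold is needed."""
    values = []
    for xs, ys in _segments(x, y, seg_len):
        n = _noise(xs, ys)
        if n == 0:
            continue                            # assumption: skip noise-free segments
        values.append(10.0 * math.log10(1.0 + _power(xs) / n))
    return sum(values) / len(values)

# Hypothetical signals: one active segment followed by one silent segment.
x = [1000.0] * 160 + [0.0] * 160
y = [900.0] * 160 + [1.0] * 160
```

In this example seg1 ignores the silent segment entirely, while seg2 counts it as 0 dB, which pulls its average down.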

3.4 LPC distance measure 

The fourth objective measure was the lpc distance proposed by 
Itakura. 14 The distance between two frames of speech with lpc 
coefficient vectors $a$ and $\hat{a}$, and with autocorrelation matrices $V$ and 
$\hat{V}$, is defined as 

$$ D_1 = d(a, \hat{a}) = \log\left[\frac{a \hat{V} a'}{\hat{a} \hat{V} \hat{a}'}\right], \tag{20} $$

where $a$ and $\hat{a}$ are $(p + 1)$ component vectors and $V$ and $\hat{V}$ are 
$(p + 1) \times (p + 1)$ matrices, where $p$ is the order of the lpc analysis. 

$D_1$ is a measure of the distance between frames of speech. This 
distance, however, does not satisfy exactly all the properties of a true 
distance metric, i.e., 

$$ d(a, \hat{a}) \ne d(\hat{a}, a). \tag{21} $$

However, for cases when $d(a, \hat{a})$ is small, the inequality of eq. (21) is 
almost an equality. To compensate for this lack of symmetry, it is 
convenient to define a second distance, $D_2$, as 

$$ D_2 = d(\hat{a}, a) = \log\left[\frac{\hat{a} V \hat{a}'}{a V a'}\right], \tag{22} $$

and an average distance between the two frames is now given by 

$$ D = \tfrac{1}{2}(D_1 + D_2). \tag{23} $$

Equation (23) defines a true distance metric which can be used to 
measure the distance (dissimilarity) between two frames of speech. It 
can readily be shown 15 that either $D_1$ or $D_2$ can be expressed in terms 
of spectral differences between the lpc models for the two frames of 
speech. 

$D_1$ and $D_2$ were measured for every utterance used in the tests to be 
described later. They were measured on a frame-by-frame basis and 
averaged across the entire utterance to give an overall lpc distance 
between the original and processed version of a sentence. Figure 10 
shows the system used to measure lpc distance for a single coder. The 
box labeled delay was used to compensate any delay inherent in the 
coder, and the bandpass filters were used both to eliminate out-of-band 
quantization noise generated in the coder and to guarantee that 
the bandwidths of both the original and coded utterances were the 
same. 

Fig. 10. Circuit for measuring lpc distance measures. (Blocks: input 
speech s(m); coder followed by a 200-3200 Hz BPF; delay followed by a 
200-3200 Hz BPF; lpc distance measure.) 
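The frame-level distance computation can be sketched with the autocorrelation method and the Levinson-Durbin recursion. The sketch assumes the common form of the Itakura distance, in which the reference coefficient vector is evaluated against the test frame's autocorrelation, and takes the symmetric distance as the mean of the two directions. Coefficient vectors are written as prediction-error filters with a leading 1, and the test frames and function names are hypothetical.

```python
import math
import random

def autocorrelation(frame, order):
    """Autocorrelation lags 0..order of one frame (autocorrelation method)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Prediction-error filter [1, a1, ..., ap] that minimizes a R a'."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a

def quadratic_form(a, r):
    """a R a' for the Toeplitz matrix R built from autocorrelation lags r."""
    size = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)] for i in range(size) for j in range(size))

def lpc_distance(ref_frame, test_frame, order=12):
    """One-sided distance: log of (a_ref R_test a_ref') / (a_test R_test a_test')."""
    r_test = autocorrelation(test_frame, order)
    a_ref = levinson_durbin(autocorrelation(ref_frame, order), order)
    a_test = levinson_durbin(r_test, order)
    return math.log(quadratic_form(a_ref, r_test) / quadratic_form(a_test, r_test))

def symmetric_lpc_distance(frame_1, frame_2, order=12):
    """Mean of the two one-sided distances."""
    return 0.5 * (lpc_distance(frame_1, frame_2, order)
                  + lpc_distance(frame_2, frame_1, order))

# Hypothetical frames: two noisy tones, 240 samples each.
rng = random.Random(1)
frame_a = [math.sin(0.3 * n) + 0.1 * rng.gauss(0.0, 1.0) for n in range(240)]
frame_b = [math.sin(0.5 * n) + 0.1 * rng.gauss(0.0, 1.0) for n in range(240)]
d_ab = lpc_distance(frame_a, frame_b)
d_sym = symmetric_lpc_distance(frame_a, frame_b)
```

Each one-sided distance is non-negative because the test frame's own coefficients minimize the quadratic form in its autocorrelation matrix. An utterance-level figure would average the frame distances, as described above.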

3.5 Comparison of SEG and D 

Figures 11 and 12 show a series of plots for two of the utterances 
used in the experiments (encoded with the adpcm coder). Part a of 
each figure shows the rms energy of the utterance as a function of 
time (frame number), part b of each figure shows the segmental s/n 
versus frame number, and part c of each figure shows the lpc distances 
(both $D_1$ and $D_2$) versus time. The utterance of Fig. 11 had an average 
lpc distance of about 0.60, whereas the utterance of Fig. 12 had an 
average lpc distance of 0.97. It can be seen in both figures that most 
of the time $D_1 \approx D_2$; however, when either $D_1$ or $D_2$ was large, the 
differences between $D_1$ and $D_2$ were often significant. It can also be 
seen in these figures that the lpc distance and the segmental s/n do 
not measure similar types of distortion; i.e., when the segmental s/n 
is small (indicating temporal distortion of the waveform) the lpc 
distance is not necessarily large (indicating spectral distortion of the 
signal). Finally, it can be seen that a large degree of variation (on a 
frame-by-frame basis) occurs with both the segmental s/n and the lpc 
distance. Thus, a single number which characterizes the "distortion" 
across the entire utterance may have little meaning in many cases. 

Fig. 11. Objective measurements as a function of time (frame number) 
for utterance A. (a) rms energy of the input signal. (b) Segmental 
s/n, seg1. (c) lpc distance. 

Fig. 12. Objective measurements as a function of time (frame number) 
for utterance B. (a) rms energy of the input signal. (b) Segmental 
s/n, seg1. (c) lpc distance. 

IV. EXPERIMENTAL DESIGN 

4.1 Circuit conditions 

The experiment tested 37 different communication circuits, each 
characterized by three parameters: direction, coder, and level. There 
were three directions: (i) single link with a waveform coder or vocoder 
alone, (ii) LPC-to-waveform, as in Fig. 1, (iii) waveform-to-LPC, as in 
Fig. 2. There were five different single links, four with waveform coders 
and one with a vocoder. Each waveform coder was tested with speech 
at three different input levels, —36 dBm, —21 dBm, and —6 dBm. The 
corresponding settings of the gain parameter, G, were 0.178, 1.00 and 
5.62, respectively. The speech level at the vocoder input was always 
—21 dBm. Thus, there were 13 single link configurations, in all. Each 
of the other two directions had 12 circuit configurations, comprising 
all combinations of four waveform coders and three input levels. 
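The count of circuit conditions and the gain settings can be reproduced with a short enumeration. The sketch assumes that G is the amplitude ratio corresponding to the offset of each input level from the -21 dBm reference, which matches the quoted values of 0.178, 1.00, and 5.62; the names are illustrative.

```python
WAVEFORM_CODERS = ["cvsd", "adm", "adpcm", "sbc"]
LEVELS_DBM = [-36, -21, -6]
REFERENCE_DBM = -21

def gain_for_level(level_dbm, reference_dbm=REFERENCE_DBM):
    """Amplitude gain G that shifts the reference level to level_dbm."""
    return 10.0 ** ((level_dbm - reference_dbm) / 20.0)

# Single links: four waveform coders at three levels, plus the vocoder at -21 dBm.
single = [("single", coder, level) for coder in WAVEFORM_CODERS for level in LEVELS_DBM]
single.append(("single", "lpc", REFERENCE_DBM))

# Tandem links in each direction: four waveform coders at three levels.
lpc_to_waveform = [("lpc-to-waveform", c, l) for c in WAVEFORM_CODERS for l in LEVELS_DBM]
waveform_to_lpc = [("waveform-to-lpc", c, l) for c in WAVEFORM_CODERS for l in LEVELS_DBM]

circuits = single + lpc_to_waveform + waveform_to_lpc
num_talkers = 4
num_stimuli = len(circuits) * num_talkers
gains = [round(gain_for_level(level), 3) for level in LEVELS_DBM]
```

With four talkers per circuit condition, the 37 conditions give the 148 stimuli used in the listening test.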

4.2 Speech material 

For this experiment, a substantial digital speech library was pre- 
pared. Four talkers, two male and two female, read 40 different 
sentences, 2 to 3 seconds long, each talker reading from a different 
phonetically balanced list. The talkers were seated in a sound-proof 
booth and spoke into a high-quality dynamic microphone. The ampli- 
fied microphone signal was lowpass-filtered at 3.2 kHz, sampled and 
converted into digital form by a 16-bit A/D converter operating at 8- 
kHz sampling frequency and finally written onto a magnetic disk. All 
the sentences were digitally equalized to the mean power level of —21 
dBm. 

For each of the 37 circuit conditions, sentences spoken by each of 
the four talkers were processed, generating a total of 148 stimuli. 
Different sentences were used in each case so that in the set of 148 
stimuli the same sentence was never heard twice. With this format, we 
speculate that intelligibility of the processed speech played an impor- 
tant role in determining quality judgments. In tests containing a few 
sentences, presented many times, each sentence becomes recognizable 
to subjects even in conditions severe enough to make it quite unintel- 
ligible at first hearing. It is our hypothesis that, in such tests, there is 
a lower correlation between intelligibility and subjective quality than 
in the tests reported here. 

4.3 Procedure 

The 148 stimuli were recorded in different random orders on 4 
analog tapes. Twenty-two students from the junior and senior classes 
of local high schools served as paid subjects. They listened to the 
processed speech monaurally over Pioneer se 700 earphones at 80 dB 
spl while seated in a double-walled sound booth with frequency- 
weighted room noise introduced at a level of 50 dBA. The total 
listening time for each group of subjects was about 30 minutes, with a 
short break after the 80th sentence. After each stimulus, the subjects 
had 4 seconds for recording their judgments. They were instructed to 
rate the quality of the stimulus by checking on their answer sheets a 
value between 1 and 9, using 1 for the worst conditions, 9 for the best 
ones, and intermediate numbers for intermediate qualities. Before the 
actual test, the subjects listened to 12 practice sentences, different 
from those used in the experiment, spanning the range of quality in 
the experiment. 

V. SPEECH QUALITY RESULTS 

Variability in the subjective and objective measures of quality of the 
148 processed speech samples can be attributed to several (variable) 
sources, namely: 

(i) The "direction" of the circuit: LPC-to-waveform coder, wave- 
form-coder-to-LPC, or single link. 

(ii) The waveform coder. 

(iii) The speech level at the input to the coder. 

(iv) The talker. 

(v) The sentence. 

(vi) The listener (subjective data only). 
(vii) Inconsistency of each listener (subjective data only). 
We are primarily interested in how the first two variables, circuit 
direction and waveform coder, influence quality. Inferences about 
these variables would be simple if they accounted for most of the 
variance in the data or if they did not interact substantially with the 
other variables. Unfortunately, neither of these conditions is met by 
our data, and many of our inferences about circuit direction and 
waveform coder will be more qualified than we would like them to be. 

5.1 Subjective data

5.1.1 Listeners 

The amount of listener agreement was fairly low relative to other
speech quality experiments. 3, 13 For each pair of listeners, we computed
the correlation coefficient of the 148 ratings. The median of the 
correlations was only 0.49. The 25th and 75th percentiles were 0.41 
and 0.60, respectively. With respect to the 148 mean ratings (averaged 
over the 22 listeners), individual listener correlations ranged from 0.50 
to 0.85, which suggests that no subject was very idiosyncratic in his 
ranking of stimulus conditions.

Figure 13 gives plots of the rating scores of each of the 22 subjects
for the LPC system alone and for each of the four talkers. The large
variability among subjects is readily seen. For example, for talker 3
the average rating was 7.3. However, two subjects gave this circuit a
rating of 1 (the lowest possible), whereas 13 subjects gave it a rating of
8 or 9 (the highest possible). Similar variability was found in the scores 
for almost every test condition. 
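
The agreement statistics quoted above can be reproduced from a listener-by-stimulus table of ratings. The following is a minimal sketch, not the authors' analysis code: the function names and the layout of `ratings` (one list of ratings per listener) are assumptions made for illustration. With 22 listeners there are 22 x 21 / 2 = 231 listener pairs.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median, quantiles


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def listener_agreement(ratings):
    """Summarize agreement; ratings[i][j] is listener i's rating of stimulus j.

    Returns ((q1, median, q3) of all pairwise listener correlations,
             list of each listener's correlation with the mean ratings).
    """
    pairwise = [pearson(a, b) for a, b in combinations(ratings, 2)]
    q1, _, q3 = quantiles(pairwise, n=4)
    mean_ratings = [mean(column) for column in zip(*ratings)]
    with_mean = [pearson(r, mean_ratings) for r in ratings]
    return (q1, median(pairwise), q3), with_mean
```

Note that the quartile convention of `statistics.quantiles` (its default "exclusive" method) may differ slightly from whatever convention was used for the percentiles reported in the text.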

The 148 listener averages are presented in Table II, where we also 
provide aggregates of these averages across input level and talker. The 
aggregated mean values show the overall effects of circuit direction 
and waveform coder. 

5.1.2 Sentences 

In many subjective testing experiments, listeners hear one or a few 
sentences repeatedly. To achieve closer conformity to practical com- 
munication situations, a different sentence for each stimulus condition 
was used. A disadvantage of this design is the lack of any control for 
or means of testing the effect of sentence content on the quality 
measures. The variability due to sentences appears in and enhances 
the experimental error; i.e., the variance that cannot be accounted for 
in statistical analyses.

Fig. 13. Rating scores as a function of subject for the individual LPC circuit for each
of the four talkers.

5.1.3 Talker effects

Averaged over all 37 circuit conditions (combinations of input level, 
coder, and direction), the ratings of speech from the two male talkers 
were 4.98 and 4.94. The averages for the two females were 3.99 and 
4.00. A three-way analysis of variance (listener by talker by circuit 
condition) revealed a very significant talker effect. Clearly, this effect 
is predominantly due to listeners giving lower ratings to distorted 
female speech than to distorted male speech. However, there is also a 
substantial talker-circuit interaction, indicating that differences in 
ratings of male and female speech are by no means uniform across 
experimental conditions. (In fact, with fairly low distortion, as in
sub-band coding in a single link, the male and female averages are
virtually the same: 7.13 and 7.20, respectively.) This nonuniformity is
evident in Table II, which also reveals that, although the overall
ratings of the two males are virtually identical, there are substantial
differences from condition to condition in the ratings of the male
voices, and likewise for the female voices.

5.1.4 Input level

The step sizes of the waveform coders were adaptable over a range
of 44 dB (for the CVSD) or 48 dB (for the other three coders). With the
rms input level varying over a range of 30 dB and individual sounds
within a sentence exhibiting a wide range of levels, the weak sounds of
the low-level signals were subject to greater-than-average granular
quantizing noise, while the strong sounds of the high-level sentences
were susceptible to overload. The maximum and minimum step sizes
of each waveform coder were chosen with the aim of centering the 
dynamic range of subjective quality in the -15 to +15 dB range of 
input levels. 

Table III shows that this design effort was entirely successful with
the CVSD and ADPCM coders, in which the dynamic range of subjective
performance is exactly symmetric around the 0-dB input level. In the
SBC and ADM coders, the overload distortion of the +15 dB input level
was less harmful subjectively than the granular noise produced with
the input set at -15 dB. In these coders, a better balance of granularity
and overload would have been achieved with lower minimum step
sizes.
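
The symmetry argument can be checked directly against the Table III averages. The short sketch below is illustrative only; the dictionary layout and the function name `level_asymmetry` are assumptions, while the numbers are the Table III entries.

```python
# Mean ratings from Table III, ordered as (-15 dB, 0 dB, +15 dB) per coder.
TABLE_III = {
    "SBC": (5.0, 5.5, 6.0),
    "CVSD": (4.1, 4.1, 4.1),
    "ADPCM": (3.9, 4.3, 3.9),
    "ADM": (3.6, 4.3, 4.4),
}


def level_asymmetry(ratings):
    """Rating at +15 dB minus rating at -15 dB (0.0 means symmetric)."""
    low, _, high = ratings
    return round(high - low, 1)


asymmetry = {coder: level_asymmetry(r) for coder, r in TABLE_III.items()}
for coder, value in asymmetry.items():
    print(f"{coder:6s} {value:+.1f}")
```

A value of zero corresponds to the symmetric behavior described for CVSD and ADPCM; a positive value means the overloaded (+15 dB) condition was rated higher than the granular-noise (-15 dB) condition, as described for SBC and ADM.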

5.1.5 Coder and direction

We have used the Tukey HSD criterion 16 to evaluate the relative 
merits of the 13 communication system configurations listed in Fig. 14. 
Figure 14a shows, for each circuit direction, groupings of coders for 
which the null hypothesis cannot be rejected at the 0.05 level. In all 
cases, SBC is superior to any of the other waveform coders. In the
LPC-to-waveform circuits, ADM, CVSD, and ADPCM have essentially the same
performance. In the waveform-to-LPC direction, ADPCM is better than
ADM and CVSD, which exhibit essentially the same quality.

Table III. Average subjective ratings
(over listeners, talkers, and direction)

Level      SBC    CVSD   ADPCM   ADM
-15 dB     5.0    4.1    3.9     3.6
  0 dB     5.5    4.1    4.3     4.3
+15 dB     6.0    4.1    3.9     4.4

Fig. 14. Relative subjective quality of coding systems. (a) Groupings of coders for each circuit direction (single link, LPC-to-waveform, waveform-to-LPC). (b) Groupings of circuit directions for each waveform coder (SBC, CVSD, ADPCM, ADM). Circles indicate that it is
impossible to reject the hypothesis that the coders have the same quality.

Figure 14b shows the equivalent groupings across direction. The
salient inference from these groupings is that, for each waveform
coder, the single link substantially outperforms either of the tandem
connections. The two tandem directions have essentially the same
quality when SBC, CVSD, or ADM is the waveform coder. The
ADPCM-to-LPC tandem is significantly better than the LPC-to-ADPCM tandem.
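
Groupings of this kind follow from comparing every pair of condition means against a single "honestly significant difference." The sketch below shows the standard equal-group-size form of the Tukey criterion, HSD = q * sqrt(MS_error / n). It is an illustration under stated assumptions, not the authors' computation: the function names are hypothetical, and the critical value `q_crit` of the studentized range (for the chosen significance level, number of groups, and error degrees of freedom) must be looked up in a table and supplied by the caller.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd_threshold(q_crit, ms_error, n_per_group):
    """Smallest difference between two group means that counts as significant."""
    return q_crit * sqrt(ms_error / n_per_group)


def indistinguishable_pairs(means, q_crit, ms_error, n_per_group):
    """Pairs of conditions whose means differ by less than the HSD.

    means maps a condition name to its mean rating. Pairs returned here
    are the ones that would share a "circle" in a grouping diagram.
    """
    hsd = tukey_hsd_threshold(q_crit, ms_error, n_per_group)
    return [
        (a, b)
        for a, b in combinations(sorted(means), 2)
        if abs(means[a] - means[b]) < hsd
    ]
```

Any condition whose mean lies more than one HSD from every other mean forms a group of its own.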

5.2 Objective measurements of quality 

Results of the objective measurements discussed in Section III are
presented in Tables IV and V. Table IV gives results for the performance
of the single-link circuits in terms of SNR, SEG1, SEG2, and LPC
distance D for each of the four talkers used in the experiment. Due to
the large variability of the objective measures across talkers and
sentences (a different sentence was used for each condition), it is
difficult to make meaningful comparisons across conditions. A similar
variability was observed for the objective measurements across individual
coders in the tandem links.

Table IV. Objective measurements of single-link coders

Talker: M1
Level (dB)   Coder    SNR     SEG1    SEG2    D
   -15       SBC      13.2    11.9     8.7    0.82
     0       SBC      14.4    13.4     8.8    0.50
   +15       SBC      13.8    10.2     8.6    0.80
   -15       CVSD     12.1    10.2     9.1    0.65
     0       CVSD      8.4    12.1    11.0    0.54
   +15       CVSD      3.5     8.3     9.8    0.54
   -15       ADPCM    12.6    12.8    10.3    0.54
     0       ADPCM    10.1    12.3    12.0    0.34
   +15       ADPCM     3.6    10.6    12.0    0.37
   -15       ADM      14.6    10.5     9.5    0.64
     0       ADM      12.3    13.2    11.1    0.47
   +15       ADM       4.7    10.7    12.2    0.47
             LPC                              0.31

Talker: F1
Level (dB)   Coder    SNR     SEG1    SEG2    D
   -15       SBC      17.7    15.3    12.3    0.53
     0       SBC      14.4    13.3    10.5    0.51
   +15       SBC      13.2    14.9    12.0    0.34
   -15       CVSD     14.8    12.1    11.6    0.84
     0       CVSD     17.5    15.2    13.8    0.65
   +15       CVSD      8.2    11.8    12.3    0.55
   -15       ADPCM    16.2    16.2    13.9    0.61
     0       ADPCM    16.5    14.0    13.9    0.50
   +15       ADPCM     8.2    12.6    13.1    0.58
   -15       ADM      16.5    14.7    12.6    0.89
     0       ADM      18.4    16.5    14.9    0.70
   +15       ADM      10.4    12.7    13.0    0.48
             LPC                              0.38

Talker: M2
Level (dB)   Coder    SNR     SEG1    SEG2    D
   -15       SBC      12.3    10.6     7.6    0.76
     0       SBC      13.8    14.5    10.8    0.38
   +15       SBC      12.3    14.1    11.1    0.27
   -15       CVSD     12.0     9.9     8.1    0.81
     0       CVSD      9.4     9.6     9.4    0.43
   +15       CVSD      4.4     9.9    11.5    0.43
   -15       ADPCM    12.8    13.0    10.5    0.50
     0       ADPCM    12.0    13.8    13.4    0.45
   +15       ADPCM     5.1    11.3    13.0    0.47
   -15       ADM      14.1    10.5     8.2    0.73
     0       ADM      11.6    12.1    11.5    0.38
   +15       ADM      [illegible]  10.7    11.6    0.35
             LPC                              0.35

Talker: F2
Level (dB)   Coder    SNR     SEG1    SEG2    D
   -15       SBC      14.6    14.1    12.0    0.60
     0       SBC      13.4    14.1    11.5    0.33
   +15       SBC      12.2    13.8    12.1    0.28
   -15       CVSD     12.8    11.5    10.0    0.74
     0       CVSD     11.1    14.9    12.5    0.62
   +15       CVSD      4.3    11.3    12.1    0.40
   -15       ADPCM    14.5    14.8    12.5    0.64
     0       ADPCM    14.3    14.9    14.5    0.38
   +15       ADPCM     4.5    11.2    12.6    0.67
   -15       ADM      16.5    12.5    10.5    0.66
     0       ADM      17.8    16.4    14.7    0.58
   +15       ADM       8.9    13.1    12.6    0.51
             LPC                              0.47

Table V gives results for LPC distance for the overall tandem links.
Again, a large variability is seen across conditions due to the different
sentences used for each measurement.

Table V. Overall LPC distances for tandem links

First    Second   Talker   Talker   Talker   Talker
Link     Link     M1       F1       M2       F2
LPC      SBC      1.18     0.93     0.83     0.81
LPC      SBC      0.20     0.73     0.672    0.66
LPC      SBC      0.61     0.57     0.46     0.56
LPC      CVSD     0.91     0.92     0.88     1.10
LPC      CVSD     0.66     1.00     0.54     0.91
LPC      CVSD     0.60     0.66     0.64     0.83
LPC      ADPCM    0.79     0.90     0.69     1.09
LPC      ADPCM    0.71     0.75     0.53     0.68
LPC      ADPCM    0.61     0.72     0.58     0.76
LPC      ADM      0.90     1.40     0.82     0.91
LPC      ADM      0.68     0.80     0.68     0.77
LPC      ADM      0.59     0.64     0.60     0.90
SBC      LPC      1.50     1.08     0.86     0.93
SBC      LPC      1.05     0.88     0.78     0.78
SBC      LPC      0.71     0.57     0.61     0.62
CVSD     LPC      0.91     0.86     0.98     1.21
CVSD     LPC      0.62     0.75     0.63     0.84
CVSD     LPC      0.62     0.82     0.56     0.89
ADPCM    LPC      0.75     0.76     0.71     0.79
ADPCM    LPC      0.63     0.80     0.67     0.87
ADPCM    LPC      0.51     0.74     0.60     0.76
ADM      LPC      0.79     0.71     0.86     1.06
ADM      LPC      0.58     0.81     0.64     0.80
ADM      LPC      0.47     0.62     0.56     0.69

5.3 Relationship of subjective and objective measures

5.3.1 Correlations

Previous studies have demonstrated the inadequacy of SNR as an
indicator of subjective quality and have pointed to segmental signal-to-noise
ratio and to LPC distance metrics as more promising measures.
In the present experiment, the diversity of speech material and of
signal-processing approaches exceeds that of previous studies, and
thus the merits of single measures and combinations of measures as
subjective quality indicators are tested more critically than ever before.

Table VI shows correlations of average rating with each of the
objective measures. The subscripts A, B, and AB, appended to SNR,
SEG1, SEG2, and D, refer to measures taken on the first link of a tandem
circuit (or the entire single-link circuit), the second link of a tandem
circuit, and the overall circuit, respectively.

Table VI indicates that the diversity of conditions either eliminates
or dilutes the value of each of the measures as a predictor of speech
quality. The table gives correlations of average rating (over 22 subjects)
with each one of the objective measures. There are nine objective
measures: three s/n's and one LPC distance for each half of a tandem
connection, and the overall LPC distance. Except for D_A, the LPC
distance of the first link, and D_AB, the overall LPC distance, none of the
measures is applicable to all conditions. (For example, s/n is measured
only in the first link in the single-link and waveform-to-LPC circuits. It
is measured only in the second link in the LPC-to-waveform circuits.)
In addition to the correlation, the table shows the number of data
points used in the computation and the two-tailed significance level for
testing the null hypothesis that the coefficient is zero.

It should be noted that, for all talkers, the only statistic for which
the null hypothesis can be rejected at the 0.01 level is D_AB, the overall
distance. The two-tailed significance level for SEG2_A is 0.001, but the
correlation is negative; that is, a higher segmental s/n went with a
lower rating, which is opposite to the expected direction. Surely a
one-tailed test applies here, and the null hypothesis cannot be rejected.
Computing correlations for ratings of male and female talkers
separately, we see the same situation, except that SEG1_B is significant
at the 0.01 level as a predictor of male speech quality on the
LPC-to-waveform tandems.

The poor correlations of practically all s/n measures with subjective
quality have led us to abandon all of them as performance indicators of
the tandem circuits and to focus our attention on the LPC distance
measures.

5.3.2 Prediction of subjective quality 

Working with the LPC distance measure for the first link of a tandem
connection, D_A, the distance measure for the second link, D_B, and
D_AB, the overall distance measure, we applied linear regression
procedures to find formulas for predicting the average ratings, R, of
the 148 circuit conditions. The best linear combination of the three
distances was

R = -5.48 D_A - 6.47 D_B + 2.52 D_AB + 7.38.    (24)

The standard deviation of the 148 mean ratings was 1.55 units on the
9-point scale and the standard error of this regression was 1.10. The
proportion of variance accounted for is thus 51 percent, and the
multiple correlation coefficient is 0.712.

The prediction accuracy can be improved somewhat by accounting
for the fact that ratings and LPC distances are related differently for
male and female talkers. We have done so by introducing a new
variable, M, where M = 1 for male talkers and M = 0 for female
talkers. Introducing M to the regression, we have

R = -4.99 D_A - 5.98 D_B + 2.14 D_AB + 0.48 M + 6.85.    (25)

Here the standard error is 1.08; i.e., 53 percent of the variance is
accounted for, and the multiple correlation coefficient is 0.727.
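
As a quick consistency check on the figures quoted for (24) and (25), the proportion of variance accounted for by a least-squares regression equals the square of its multiple correlation coefficient. The symbol \rho below is introduced only for this illustration (to avoid a clash with the rating R) and is not the authors' notation.

```latex
\rho_{(24)}^{2} = 0.712^{2} \approx 0.507 \quad (\text{about 51 percent}),
\qquad
\rho_{(25)}^{2} = 0.727^{2} \approx 0.529 \quad (\text{about 53 percent}).
```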

Various transformations of the distance data were also studied, and
a simple log transform proved useful in regression equations. We define
the transform variables

L_A = ln(D_A);    L_AB = ln(D_AB)

and

L_B = ln(D_B) in tandem circuits and
L_B = -4.0 in single-link circuits.

The value -4.0 has been chosen empirically. (It corresponds to a
distance of 0.018. The lowest measured distance was 0.21, which was
observed for several sentences processed by LPC.)

Using the log-transformed distances, the regression equations
corresponding to (24) and (25) are

R = -1.55 L_A - 0.785 L_B - 0.211 L_AB + 1.59    (26)

and

R = -1.34 L_A - 0.782 L_B - 0.0622 L_AB + 0.643 M + 1.51.    (27)

The standard error of (26) is 1.02; the equation accounts for 58 percent
of the variance in average ratings, and the multiple correlation
coefficient is 0.760. The corresponding statistics for (27) are 0.973,
62 percent, and 0.785.
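
To make the use of the log-transformed predictors concrete, the following sketch evaluates (26) and (27) for a given set of LPC distances. It is an illustration, not the authors' software: the function name and argument layout are hypothetical, the coefficients are as read from equations (26) and (27) in this copy, and passing the same value for the first-link and overall distances of a single-link circuit is an assumption of the example.

```python
from math import log

# Empirical value the text assigns to L_B for single-link circuits.
SINGLE_LINK_LB = -4.0


def predict_rating(d_a, d_ab, d_b=None, male=None):
    """Predicted mean rating from LPC distances.

    d_a:  LPC distance of the first link (or of the single link)
    d_ab: overall LPC distance
    d_b:  LPC distance of the second link; None for a single-link circuit
    male: None selects eq. (26); True or False selects eq. (27) with
          M = 1 or M = 0, respectively
    """
    l_a = log(d_a)
    l_ab = log(d_ab)
    l_b = SINGLE_LINK_LB if d_b is None else log(d_b)
    if male is None:
        return -1.55 * l_a - 0.785 * l_b - 0.211 * l_ab + 1.59
    m = 1.0 if male else 0.0
    return -1.34 * l_a - 0.782 * l_b - 0.0622 * l_ab + 0.643 * m + 1.51


# Example: a single-link circuit with distance 0.31, using eq. (26).
print(round(predict_rating(0.31, 0.31), 2))
```

Because every distance term enters with a negative coefficient, larger distances always lower the predicted rating, and in (27) a male talker shifts the prediction up by 0.643 rating units for the same distances.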

VI. DISCUSSION 

These data analyses allow us to make generalized statements in 
answer to the three questions posed in Section 1.2. Owing to the 
interactions in the data, there are specific exceptions to many of the 
general conclusions of the following subsections. 

6.1 Quality of tandem connections

A strong conclusion of the study is that any tandem connection of 
the vocoder is substantially worse than either of the two corresponding 
single links. Although we did not attach descriptive adjectives to rating 
categories, we have the impression that ratings below about 4.0 re- 
flected degradations severe enough to render a circuit inadequate for 
effective communication. 

In our judgment, the results of this experiment strongly suggest that
a tandem connection involving any of the three differential waveform
coders (CVSD, ADPCM, or ADM) is inadequate. It appears that the LPC-SBC
tandem could provide reasonable communication in many circumstances,
but that the SBC-LPC tandem is of marginal use.

6.2 Alternatives to CVSD 

Only the sub-band coder, which is substantially more complicated,
offers significantly better performance than CVSD over all circuit
conditions, talkers, and input levels. ADM, a double-integration version
of CVSD, has the same subjective quality (within the bounds of
experimental error), and ADPCM is better than CVSD in one tandem direction,
equal to CVSD in the other tandem direction, and worse than CVSD in
the single-link configuration. The ADPCM coder was designed by
extrapolating, to 16 kb/s, results of an experiment involving 24 kb/s
and 32 kb/s coders. The result of this design optimization was a coder
that adapts somewhat more slowly than ADPCM coders used elsewhere.
It may be that higher quality could be obtained with a faster adaptive
quantizer in the ADPCM coder.

6.3 Objective measures 

The wide variety of circuit conditions and speech material either 
destroyed or strongly diluted the value of the objective measures as 
indicators of speech quality. With the wide range of input levels, the 
outputs of differential waveform coders contained various types of
additive noise and signal distortion. Meanwhile, the sub-band coder
and LPC each have their own peculiar distortions: a reverberant effect
and a mechanical buzziness, respectively. The presence of all these
impairments in the single-link circuits and their combinations in the
tandem circuits together present a diversity of quality that would be
very hard to describe with a single measure.

While the wide range of circuit conditions produces great subjective 
variability, the variety of speech material seems to have a strong effect 
on the objective measures. We speculate that sentence-to-sentence 
fluctuation in objective measures is greater than that of corresponding 
subjective impressions. 

These irregularities led to regression formulas of considerably less
accuracy [about 60 percent of variance accounted for by eqs. (26) and
(27)] than the 70 to 90 percent obtained in other studies. 4, 13 Our work
lends support to the value of current efforts to find more robust
objective measures. 17-19